A DNA Sequence Compression Algorithm Based on Look-up Table and LZ77
Abstract
This article introduces a new DNA sequence compression algorithm based on a look-up table and the LZ77 algorithm. By combining a look-up-table-based pre-coding routine with an LZ77 compression routine, the algorithm can reach a compression ratio of 1.9 bits/base or even lower. Its main advantages are speed, a small memory footprint, and ease of implementation.
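The two-stage idea in the abstract (table-based pre-coding followed by LZ77) can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's actual routine: the abstract does not specify the table layout, so the 4-base-per-byte table (`K`, `ENCODE_TABLE`, `DECODE_TABLE`) and the helper names `precode`, `compress`, `decompress`, and `bits_per_base` are hypothetical. Python's `zlib` is used as a stand-in for the LZ77 stage; its DEFLATE format adds Huffman coding on top of LZ77, so it is not a pure LZ77 coder.

```python
import itertools
import zlib

BASES = "ACGT"
K = 4  # bases per table entry (assumption: 4**4 = 256 entries fit in one byte)

# Hypothetical look-up table: every K-mer maps to one byte value, plus the reverse map.
ENCODE_TABLE = {"".join(p): i for i, p in enumerate(itertools.product(BASES, repeat=K))}
DECODE_TABLE = {i: kmer for kmer, i in ENCODE_TABLE.items()}


def precode(seq: str) -> bytes:
    """Pre-coding stage: replace each K-mer with its table index.

    Byte 0 stores len(seq) % K so that padding can be removed on decode.
    """
    tail = len(seq) % K
    padded = seq + "A" * ((K - tail) % K)  # pad the last group with 'A'
    body = bytes(ENCODE_TABLE[padded[i:i + K]] for i in range(0, len(padded), K))
    return bytes([tail]) + body


def compress(seq: str) -> bytes:
    """Second stage: DEFLATE (LZ77 + Huffman) as a stand-in for plain LZ77."""
    return zlib.compress(precode(seq), 9)


def decompress(blob: bytes) -> str:
    """Invert both stages and strip the padding added by precode."""
    raw = zlib.decompress(blob)
    tail, body = raw[0], raw[1:]
    seq = "".join(DECODE_TABLE[b] for b in body)
    if tail:
        seq = seq[: len(seq) - (K - tail)]
    return seq


def bits_per_base(seq: str) -> float:
    """Compressed size in bits divided by the number of bases."""
    return 8 * len(compress(seq)) / len(seq)


if __name__ == "__main__":
    sample = "ACGTACGGTCA" * 500  # artificial, highly repetitive input
    assert decompress(compress(sample)) == sample
    print(f"{bits_per_base(sample):.3f} bits/base")
```

With this table the pre-coding step alone already costs about 2 bits per base, so any further gain comes from the repeats that the LZ77-style stage finds in the pre-coded byte stream. The artificial sample above is far more repetitive than real DNA, so its ratio says nothing about the 1.9 bits/base figure reported in the abstract.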
Similar resources
Compression of large DNA databases
The thesis explores algorithms to efficiently store and access repetitive DNA sequence collections produced by large-scale genome sequencing projects. First, existing general-purpose and DNA compression algorithms are evaluated for their suitability for compressing large collections of DNA sequences. Then two novel algorithms for compressing large collections of DNA sequences are introduced. Th...
A Biological sequence Compression based on Look up Table (LUT) using Complementary Palindrome of Fixed size
Data storage costs make up an appreciable proportion of the total cost of creating and analyzing DNA sequences. In particular, the growth in DNA sequence data is remarkable compared with the growth in disk storage capacity. General text compression algorithms do not utilize the specific characteristics of DNA sequences. In this paper we have proposed a compression algorithm based on...
Optimal Current Meter Placement for Accurate Fault Location Purpose using Dynamic Time Warping
This paper presents a fault location technique for transmission lines with minimum current measurement. The algorithm investigates suitable current ratios for the fault location problem, based on Thevenin theory in faulty power networks and the calculation of short-circuit currents in each branch. These current ratios are extracted regarding lowest sensitivity on Thevenin impedance variations of the netw...
A Sub-Optimal Look-Up Table Based on Fuzzy System to Enhance the Reliability of Coriolis Mass Flow Meter
Coriolis mass flow meters are among the most accurate tools for measuring mass flow in industry. However, two-phase mode (gas-liquid) may cause severe operating difficulties and reduce measurement certainty. This paper presents a method based on fuzzy systems to correct the error and improve the reliability of these sensors in the presence of two-phase model fluid. Definite ...
Self-Indexing Based on LZ77 ? Sebastian
We introduce the first self-index based on the Lempel-Ziv 1977 compression format (LZ77). It is particularly competitive for highly repetitive text collections such as sequence databases of genomes of related species, software repositories, versioned document collections, and temporal text databases. Such collections are extremely compressible but classical self-indexes fail to capture that sou...